Biology Methods and Protocols

Oxford University Press (OUP)

Preprints posted in the last 30 days, ranked by how well they match Biology Methods and Protocols's content profile, based on 53 papers previously published here. The average preprint has a 0.08% match score for this journal, so anything above that is already an above-average fit.

1
Explainable, Lightweight Deep Learning for Colorectal Cancer Microsatellite Instability Screening in Low-Resource Settings

Adegbosin, O. T.; Patel, H.

2026-04-20 oncology 10.64898/2026.04.18.26350809 medRxiv
Top 0.1%
18.4%

Background: Microsatellite stability status determination is important for prognostication and therapeutic decision making in colorectal cancer management, but the conventional methods for this assessment are not readily available, especially in low- and middle-income countries. Deep learning (DL) models have been proposed for addressing this problem; however, potential computational cost due to model complexity and inadequate explainability may limit their adoption in low-resource settings. This study explored the potential of explainable lightweight models for detection of microsatellite instability in colorectal cancer. Methods: DL models were trained using a public dataset of colorectal cancer histology images and then used to classify a set of test images into one of two classes: microsatellite instability or microsatellite stability. The models were compared for efficiency. Gradient-weighted class activation mapping (Grad-CAM) was used to interpret the models' decision making. Results: The simpler convolutional neural network (CNN) trained from scratch had modest performance (accuracy=0.757, area under receiver-operating characteristic curve [AUROC]=0.840). With an attention mechanism added, these values increased, but specificity and sensitivity decreased. Pretrained models performed better than the ones trained from scratch, and EfficientNet_B0 had the best balance of high performance and low computational requirements (accuracy=0.936, AUROC=0.990, negative predictive value=0.923, specificity=0.953, 4,010,000 trainable parameters, 0.38 gigaFLOPs). However, a simple CNN model with attention mechanism had the best interpretability based on Grad-CAM. Conclusion: This study demonstrated that DL models that are lightweight when compared to previously proposed ones can be useful for colorectal cancer microsatellite instability screening in resource-limited settings while balancing performance and computational efficiency.

2
New three-dimensional preclinical models to understand and treat liver cancers activated for the β-catenin pathway

Bou Malham, V.; Leandre, F.; Hamimi, A.; Lagoutte, I.; Bouchet, S.; Gougelet, A.; Colnot, S.; Desbois-Mouthon, C.

2026-04-03 cell biology 10.64898/2026.04.01.715868 medRxiv
Top 0.1%
15.3%

Background & aims: Constitutive activation of the β-catenin pathway is a determining feature in the pathogenesis of two primary liver cancers, namely HCC and hepatoblastoma (HB). Activating alterations in CTNNB1 gene and, to a lesser extent, inhibiting alterations in APC gene are observed in 30 to 40% of HCC cases and 80 to 90% of HB cases. For both tumours, therapeutic management is far from optimal. Therefore, relevant experimental models are needed to increase our knowledge and test new therapeutic approaches. Methods: Organoids and tumouroids were established from APCΔhep and βcatΔex3 mouse models, which are clinically relevant models for β-catenin-activated HCC and mesenchymal HB. We developed a new methodological approach based on a dynamic suspension culture in a rotating bioreactor. Morphological and molecular characteristics and sensitivity to WNTinib, a treatment already successfully tested on human HCC and HB tumouroids, were evaluated by histology, immunohistochemistry, immunofluorescence, and RT-qPCR. Results: This easy-to-implement methodology allows for the rapid generation of a large number of organoids and tumouroids that are uniform in size and show no signs of cell death in their core. The robustness of the methodology is illustrated by the maintenance of the histological architecture, cell diversity and gene expression in organoids and tumouroids in comparison with the native liver tissues. In addition, the value of the HCC-derived tumouroids for evaluating cancer treatment was assessed based on their responsiveness to the β-catenin antagonist WNTinib. Conclusions: The organoids and tumouroids that we present here are new reliable in vitro cancer models, recapitulating the main features of β-catenin-driven HCC and mesenchymal HB. They can be integrated into an appropriate platform for drug screening and could enable the development of "a la carte" therapies that are urgently needed for these indications.
Impact and implications: This study addresses the critical need for representative in vitro models to investigate β-catenin-driven liver cancers. The organoids and tumouroids developed here are particularly valuable for researchers seeking robust, reproducible models that accurately reflect the cellular diversity and gene expression profiles of native liver tumours. These findings have practical applications in exploring cancer mechanisms, screening new drugs, optimizing personalized treatment strategies, and reducing reliance on animal models, which ultimately benefits patients. Highlights: Easy and rapid generation of mouse liver organoids and tumouroids from β-catenin activated tumours using culture in a bioreactor; Tumouroids preserve histology, cell diversity, and gene expression of native tissue; HCC-derived tumouroids respond to β-catenin inhibitor WNTinib; These reliable 3D models reduce reliance on animal experiments for drug testing.

3
AENEAS Project: First real-time intraoperative application of machine vision-based anatomical guidance in neurosurgery

Sarwin, G.; Ricciuti, V.; Staartjes, V. E.; Carretta, A.; Daher, N.; Li, Z.; Regli, L.; Mazzatenta, D.; Zoli, M.; Seungjun, R.; Konukoglu, E.; Serra, C.

2026-04-11 surgery 10.64898/2026.04.09.26348607 medRxiv
Top 0.1%
12.4%

Background and Objectives: We report the first intraoperative deployment of a real-time machine vision system in neurosurgery, derived from our previous anatomical detection work, automatically identifying structures during endoscopic endonasal surgery. Existing systems demonstrate promising performance in offline anatomical recognition, yet so far none have been implemented during live operations. Methods: A real-time anatomy detection model was trained using the YOLOv8 architecture (Ultralytics). Following training completion in the PyTorch environment, the model was exported to ONNX format and further optimized using the NVIDIA TensorRT engine. Deployment was carried out using the NVIDIA Holoscan SDK; the system ran on an NVIDIA Clara AGX developer kit. We used the model for real-time recognition of intraoperative anatomical structures and compared its output with manual labels of the same video as the reference. Model performance was reported using the average precision at an intersection-over-union threshold of 0.5 (AP50). Furthermore, end-to-end delay from frame acquisition to the display of the annotated output was measured. Results: A mean AP50 of 0.56 was achieved. The model demonstrated reliable detection of the most relevant landmarks in the transsphenoidal corridor. The mean end-to-end latency of the model was 47.81 ms (median 46.57 ms). Conclusion: For the first time, we demonstrate that clinical-grade, real-time machine-vision assistance during neurosurgery is feasible and can provide continuous, automated anatomical guidance from the surgical field. This approach may enhance intraoperative orientation, reduce cognitive load, and offer a powerful tool for surgical training. These findings represent an initial step toward integrating real-time AI support into routine neurosurgical workflows.

4
Assessing medication-related burden and medication adherence among older patients from Central Nepal: A machine learning approach

Giri, R.; Agrawal, R.; Lamichhane, S. R.; Barma, S.; Mahatara, R.

2026-04-23 geriatric medicine 10.64898/2026.04.22.26351447 medRxiv
Top 0.1%
8.9%

We are pleased to submit our original article entitled "Assessing medication-related burden and medication adherence among older patients from Central Nepal: A machine learning approach" for consideration in your esteemed journal. In this paper, we assessed medication burden using the validated Living with Medicines Questionnaire (LMQ-3) and medication adherence using the Adherence to Refills and Medications Scale (ARMS). We analysed our results with a machine learning approach instead of a traditional statistical approach to identify the complex factors influencing both. Six ML architectures (Ordinary Least Squares, LightGBM, Random Forest, XGBoost, SVM, and penalized linear regression) were employed to predict ARMS and LMQ scores using various socio-demographic, clinical and medication-related predictive features. Model explainability was provided through SHAP (SHapley Additive exPlanations). Our study identified a moderate medication burden with moderate non-adherence among older adults. Requiring assistance with medication and polypharmacy were the strongest drivers of medication burden and non-adherence. The high predictive accuracy of the ML models suggests that appropriate clinical interventions such as deprescribing could help address the highly prevalent medication burden and non-adherence among older adults in Nepal.

5
A Comparative Study in Surgical AI: Datasets, Foundation Models, and Barriers to Med-AGI

Skobelev, K.; Fithian, E.; Baranovski, Y.; Cook, J.; Angara, S.; Otto, S.; Yi, Z.-F.; Zhu, J.; Donoho, D. A.; Han, X. Y.; Mainkar, N.; Masson-Forsythe, M.

2026-03-28 surgery 10.64898/2026.03.26.26349455 medRxiv
Top 0.1%
8.6%

Recent Artificial Intelligence (AI) models have matched or exceeded human experts in several benchmarks of biomedical task performance, but have lagged behind on surgical image-analysis benchmarks. Since surgery requires integrating disparate tasks (including multimodal data integration, human interaction, and physical effects), generally-capable AI models could be particularly attractive as a collaborative tool if performance could be improved. On the one hand, the canonical approach of scaling architecture size and training data is attractive, especially since there are millions of hours of surgical video data generated per year. On the other hand, preparing surgical data for AI training requires significantly higher levels of professional expertise, and training on that data requires expensive computational resources. These trade-offs paint an uncertain picture of whether and to what extent modern AI could aid surgical practice. In this paper, we explore this question through a case study of surgical tool detection using state-of-the-art AI methods available in 2026. We demonstrate that even with multi-billion parameter models and extensive training, current Vision Language Models fall short in the seemingly simple task of tool detection in neurosurgery. Additionally, we show scaling experiments indicating that increasing model size and training time only leads to diminishing improvements in relevant performance metrics. Thus, our experiments suggest that current models could still face significant obstacles in surgical use cases. Moreover, some obstacles cannot be simply "scaled away" with additional compute and persist across diverse model architectures, raising the question of whether data and label availability are the only limiting factors. We discuss the main contributors to these constraints and advance potential solutions.

6
Multi-Stain Fusion of Histopathology Images Using Deep Learning for Pediatric Brain Tumor Classification

Spyretos, C.; Tampu, I. E.; Lindblad, J.; Haj-Hosseini, N.

2026-04-14 pathology 10.64898/2026.04.10.717785 medRxiv
Top 0.1%
8.5%

The classification of pediatric brain tumors is investigated using deep learning on hematoxylin and eosin (H&E) and antigen Ki-67 (Ki-67) whole slide images (WSIs) from the Children's Brain Tumor Network (CBTN) dataset. A total of 1,662 unregistered WSIs (1,047 H&E and 615 Ki-67 images) were analyzed, including low-grade glioma/astrocytoma (grades 1, 2) (LGG), high-grade glioma/astrocytoma (grades 3, 4) (HGG), medulloblastoma (MB), ependymoma (EP) and ganglioglioma. The aim of this study was to effectively classify pediatric brain tumors using H&E and Ki-67 WSIs individually, and to investigate whether early, intermediate, and late fusion could improve the predictive performance. From each WSI, 224 × 224 pixel patches were extracted, and the instance (patch)-level features were obtained using the histology foundation model CONCHv1_5. The instances were aggregated using clustering-constrained attention multiple instance learning (CLAM) for patient-level classification. Model interpretability and explainability were assessed through attention heatmaps, cell density and Ki-67 labelling index (LI) maps. In the binary grade classification between LGG and HGG, the intermediate concatenation fusion achieved the best performance with a balanced accuracy of 0.88 ± 0.05 (p < 0.005) compared to the single-stain models (H&E: 0.84 ± 0.05, Ki-67: 0.86 ± 0.05). For the 5-class tumor type classification, the one-hidden layer late fusion learning model achieved the highest balanced accuracy of 0.83 ± 0.04 (p < 0.005), outperforming the single-stain models (H&E: 0.77 ± 0.05, Ki-67: 0.74 ± 0.05). Overall, most of the fusion approaches outperformed the single-stain models in both classification tasks (p < 0.005).
The Ki-67 attention maps demonstrated moderate to strong Spearman correlation (ρ = 0.576 - 0.823) with the cell density and Ki-67 LI maps, suggesting that these features are associated with the model's predictions, although additional features may contribute. The results show that H&E and Ki-67 images provide complementary information, and most of the multi-stain fusion approaches using deep learning improve pediatric brain tumor diagnosis.

7
Elder-Sim: A Psychometrically Validated Platform for Personality-Stable Elderly Digital Twins

Wang, J.; Yang, Z.; Zhu, Z.; Zhu, X.; Huang, Z.; Wang, H.; Tian, L.; Cao, Y.; Qu, X.; Qi, X.; Wu, B.

2026-03-30 geriatric medicine 10.64898/2026.03.25.26349036 medRxiv
Top 0.1%
8.4%

Background: LLMs enable patient-facing conversational agents, creating a pathway toward digital twins that capture older adults' lived experiences and behavioral responses across time. A central barrier is personality drift (inconsistent trait expression across repeated interactions), which undermines reliability of generated trajectories and intervention-response simulation in geriatric care. Objective: To develop ELDER-SIM, a multi-role elderly-care conversational platform for building personality-stable digital twin agents, and to propose a psychometric validation framework for quantifying personality consistency in LLM-based agents. Methods: ELDER-SIM was implemented via n8n workflow orchestration with local LLM inference (Ollama/vLLM), integrating (1) Big Five (OCEAN) trait specifications, (2) a Cognitive Conceptualization Diagram (CCD) grounded in Beck's CBT framework, and (3) a MySQL-based long-term memory module. Ablation studies across four conditions, namely Baseline, +Memory, +CCD, and +LoRA (fine-tuned on 19,717 instruction pairs from CHARLS), were evaluated via Cronbach's α, ICC, and role discrimination accuracy. Results: Personality measurement reliability was acceptable to excellent across conditions (Cronbach's α: 0.70-0.94), with consistently high test-retest stability (ICC: 0.85-0.96). Role discrimination improved stepwise from 83.3% (Baseline) to 88.9% (+Memory), 94.4% (+CCD), and 97.2% (+LoRA). CCD produced the largest gain in internal consistency (mean α 0.702 → 0.892), while LoRA achieved the highest overall internal consistency (α = 0.940) and ICC (0.958). Conclusions: ELDER-SIM provides a psychometrically validated approach for constructing personality-consistent elderly digital twin agents. Structured cognitive modeling and domain adaptation reduce personality drift, supporting reliable longitudinal simulation for elderly mental health care and reproducible in silico evaluation before clinical deployment.

8
Quantifying PD1 saturation by PDL1 in tumor tissue using a novel RNA aptamer-based assay

Veeramani, S.; Yin, C.; Yu, N.; Coleman, K. L.; Smith, B. J.; Weiner, G. J.

2026-04-08 immunology 10.64898/2026.04.06.716702 medRxiv
Top 0.1%
7.1%

Background: Therapeutic agents targeting the PD1-PDL1 interaction are of great clinical value; however, accurately predicting which patients are most likely to benefit is challenging. Improved predictive biomarkers for anti-PD1 therapy are clearly needed. Quantifying PD1 saturation by PDL1 in tumor tissue has the potential to serve as such a biomarker. Here we report a novel bioassay called the PD1 Ligand Receptor Complex Aptamer (LIRECAP) assay and demonstrate it can be used to quantify the saturation of PD1 by PDL1 in formalin-fixed paraffin-embedded tumor biospecimens. Results: The PD1 LIRECAP assay was developed by identifying a pair of RNA aptamers. One aptamer preferentially binds to unoccupied PD1 (P aptamer) and the other to the PD1-PDL1 complex (C aptamer). P and C aptamers were added together to a formalin-fixed sample, and bound aptamer was extracted. A 2-color qRT-PCR assay using a single set of primers was used to determine the ratio of the sample-bound C to P aptamers (C:P ratio), which reflected PD1 saturation by PDL1 in the sample. Quantification of PD1 saturation by PDL1 as determined by the PD1 LIRECAP assay correlated closely with PD1-mediated signaling and PD1-PDL1 proximity. Analysis of sarcoma FFPE biospecimens confirmed the assay is technically reproducible on clinical biospecimens. There were significant differences in PD1 saturation by PDL1 between patients as well as considerable intratumoral heterogeneity. Conclusions: The PD1 LIRECAP assay is a novel assay that can be used to quantify PD1 saturation by PDL1 in clinical biospecimens. The assay is technically feasible, reproducible, and has the potential to serve as a superior predictive biomarker for PD1/PDL1-based therapy. Similar assays based on this platform could be used in other systems and settings to quantify interaction between two molecules.

9
Comparison of foundation models and transfer learning strategies for diabetic retinopathy classification

Li, L. Y.; Lebiecka-Johansen, B.; Byberg, S.; Thambawita, V.; Hulman, A.

2026-04-20 health informatics 10.64898/2026.04.17.26351092 medRxiv
Top 0.1%
6.8%

Diabetic retinopathy (DR) is a leading cause of vision impairment, requiring accurate and scalable diagnostic tools. Foundation models are increasingly applied to clinical imaging, but concerns remain about their calibration. We evaluated DINOv3, RETFound, and VisionFM for DR classification using different transfer learning strategies in BRSET (n = 16,266) and mBRSET (n = 5,164). Models achieved high discrimination in binary classification (normal vs retinopathy) in BRSET (AUROC 0.90-0.98), with DINOv3 achieving the best performance under full fine-tuning (AUROC 0.98 [95% CI: 0.97-0.99]). External validation on mBRSET showed decreased performance for all models regardless of the fine-tuning strategy (AUROC 0.70-0.85), though fine-tuning improved performance. Foundation models achieved strong discrimination but poor calibration, generally overestimating DR risk. While the generalist model, DINOv3, benefited from deeper fine-tuning, miscalibration remained evident. These findings underscore the need to improve calibration and the comprehensive evaluation of foundation models, which are essential in clinical settings. Author summary: Artificial intelligence is increasingly being used to detect eye diseases such as diabetic retinopathy from retinal images. Recent advances have introduced "foundation models," which are trained on large datasets and can be adapted to new tasks. We aimed to evaluate how well these models perform in a clinical prediction context, with a focus not only on accuracy but also on how reliably they estimate disease risk. In this study, we compared different types of foundation models using two independent datasets from Brazil. We found that while these models were generally good at distinguishing between healthy and diseased eyes, their predicted risks were often poorly calibrated. In other words, the estimated probabilities did not consistently reflect the true likelihood of disease.
We also examined whether adapting the models to the target population could improve performance. Although this approach led to improvements, calibration issues remained. However, post-training correction improved the agreement between predicted risks and observed outcomes. Our findings highlight an important gap between model performance and clinical usefulness. We suggest that improving the reliability of risk estimates is essential before such systems can be safely used in healthcare.

10
Analysis and Mitigation of Equipment-induced Shortcuts in AI Models for Laparoscopic Cholecystectomy

Protserov, S.; Repalo, A.; Mashouri, P.; Hunter, J.; Masino, C.; Madani, A.; Brudno, M.

2026-04-24 surgery 10.64898/2026.04.22.26351545 medRxiv
Top 0.1%
6.7%

Machine learning models have seen a lot of success in the medical image segmentation domain. However, one of the challenges that they face is confounders or shortcuts: spurious correlations or biases in the training data that affect the resulting models. One example of such confounders for surgical machine learning is the setup of surgical equipment, including tools and lighting. Using the task of identification of safe and dangerous zones of dissection in laparoscopic cholecystectomy images and videos as a use-case, we inspect two equipment-induced biases: the presence of surgical tools in the field of view and the position of lighting. We propose methods for evaluating the severity of these biases and augmentation-based methods for mitigating them. We show that our tool bias mitigations improve the models' consistency under tool movements by 9 percentage points in the most inconsistent cases, and by 4 percentage points on average. Our lighting bias mitigations help reduce the fraction of true dangerous zone pixels that may be predicted as safe under light changes from 5% to 1.5%, without compromising segmentation quality.

11
The impact of non-invasive prehabilitation before surgery on emotional well-being in neuro-oncology patients: Insights from the Prehabilita project

Brault-Boixader, N.; Roca-Ventura, A.; Delgado-Gallen, S.; Buloz-Osorio, E.; Perellon-Alfonso, R.; Hung Au, C.; Bartres-Faz, D.; Pascual-Leone, A.; Tormos Munoz, J. M.; Abellaneda-Perez, K.; Prehabilita Working Group

2026-04-12 oncology 10.64898/2026.04.08.26350382 medRxiv
Top 0.1%
6.6%

Prehabilitation (PRH) is a preoperative process aimed at optimizing patients' functional capacity to improve surgical outcomes and overall well-being. While its physical and cognitive benefits are increasingly documented, its emotional impact, particularly in neuro-oncology patients, remains less explored. This study assessed the psychological effects of a PRH program on 29 brain tumor patients. The primary outcome, emotional well-being, was measured using quality of life and emotional distress metrics. Secondary outcomes included perceived stress levels and control attitudes. Additionally, qualitative data from structured interviews provided further insights into the psychological effects of the intervention. The results indicated significant improvements in quality of life and reductions in emotional distress, particularly among women. While perceived stress levels remained stable, control attitudes showed an increase. Qualitative analysis further highlighted the positive changes in the sense of control and identified additional factors, such as the importance of social support sources during the PRH process. Overall, these findings suggest that PRH interventions play a significant role in enhancing emotional well-being among neuro-oncological patients in the preoperative phase. These results underscore the importance of implementing comprehensive and personalized PRH approaches to optimize clinical status both before and after surgery, thereby promoting sustained psychological benefits in this population. This study is based on data collected at Institut Guttmann in Barcelona in the context of the Prehabilita project (ClinicalTrials.gov identifier: NCT05844605; registration date: 06/05/2023).

12
Comparing prognostic performance and reasoning between large language models and physicians

Gjertsen, M.; Yoon, W.; Afshar, M.; Temte, B.; Leding, B.; Halliday, S.; Bradley, K.; Kim, J.; Mitchell, J.; Sanders, A. K.; Croxford, E. L.; Caskey, J.; Churpek, M. M.; Mayampurath, A.; Gao, Y.; Miller, T.; Kruser, J. M.

2026-04-25 intensive care and critical care medicine 10.64898/2026.04.17.26350898 medRxiv
Top 0.1%
6.2%

Importance: Physicians routinely prognosticate to guide care delivery and shared decision making, particularly when caring for patients with critical illnesses. Yet, these physician estimates are prone to inaccuracy and uncertainty. Artificial intelligence, including large language models (LLMs), shows promise in supporting or improving this prognostication. However, the performance of contemporary LLMs in prognosticating for the heterogeneous population of critically ill patients remains poorly understood. Objective: To characterize and compare the performance of LLMs and physicians when predicting 6-month mortality for hospitalized adults who survived critical illness. Design: Embedded mixed methods study with elicitation and comparison of prognostic estimates and reasoning from LLMs and practicing physicians. Setting: The publicly available, deidentified Medical Information Mart for Intensive Care (MIMIC)-IV v2.2 dataset. Participants: We randomly selected 100 hospitalizations of adult survivors of critical illness. Four contemporary LLMs (OpenAI GPT-4o, o3- and o4-mini, and DeepSeek-R1) and 7 physicians provided independent prognostic estimates for each case (1,100 total estimates; 400 LLM and 700 physician). Main outcomes and measures: For each case, LLMs and physicians used the hospital discharge summary and demographics to predict 6-month mortality (yes/no) and provide their reasoning (free text). We assessed prognostic performance using accuracy, sensitivity, and specificity, and used inductive, qualitative content analysis to characterize reasoning. Results: Mean physician accuracy for predicting mortality was 70.1% (95% CI 63.7-76.4%), with sensitivity of 59.7% (95% CI 50.6-68.8%) and specificity of 80.6% (95% CI 71.7-88.2%). The top-performing LLM (OpenAI o4-mini) accuracy was 78.0% (95% CI 70.0-86.0%), with sensitivity of 80.0% (95% CI 67.4-90.2%) and specificity of 76.0% (95% CI 63.3-88.0%).
The difference between mean physician and top-performing LLM accuracy was not statistically significant (p = 0.5). Qualitative analysis revealed similar patterns in LLM and physician expressed reasoning, except that physicians regularly and explicitly reported uncertainty while LLMs did not. Conclusion and Relevance: In this study, LLMs and physicians achieved comparable, moderate performance in predicting 6-month mortality after critical illness, with similar patterns in expressed reasoning. Our findings suggest LLMs could be used to support prognostication in clinical practice but also raise safety concerns due to the lack of LLM uncertainty expression.

13
Automated Detection of Dental Caries and Bone Loss on Periapical and Bitewing Radiographs using a YOLO Based Deep Learning Model

Alqaderi, H.; Kapadia, U.; Brahmbhatt, Y.; Papathanasiou, A.; Rodgers, D.; Arsenault, P.; Cardarelli, J.; Zavras, A.; Li, H.

2026-04-17 dentistry and oral medicine 10.64898/2026.04.12.26350726 medRxiv
Top 0.1%
5.2%

Background: Dental caries and periodontal disease represent the most prevalent global oral health conditions, collectively affecting several billion people. The diagnostic interpretation of dental radiographs, a cornerstone of modern dentistry, is associated with considerable inter-observer variability. In routine clinical practice, clinicians are required to evaluate a high volume of radiographic images daily, a cognitively demanding task in which diagnostic fatigue, time constraints, and the inherent complexity of overlapping anatomical structures can lead to the inadvertent oversight of early-stage pathologies. Artificial intelligence (AI) offers a transformative opportunity to augment clinical decision-making by providing rapid, objective, and consistent radiographic analysis, thereby serving as a tireless adjunct capable of flagging findings that may be missed during routine human inspection. Methods: This study developed and validated a deep learning system for the automated detection of dental caries and alveolar bone loss using a dataset of 1,063 periapical and bitewing radiographs. Two separate YOLOv8s object detection models were trained and evaluated using a rigorous 5-fold cross-validation methodology. To align with the clinical use-case of a screening tool where high sensitivity is paramount, a custom image-level evaluation criterion was employed: a true positive was recorded if any predicted bounding box had a Jaccard Index (IoU) > 0 with any ground truth annotation. Model performance was systematically evaluated at confidence thresholds of 0.10 and 0.05. Results: At a confidence threshold of 0.05, the caries detection model achieved a mean precision of 84.41% (±0.72%), recall of 85.97% (±4.72%), and an F1-score of 85.13% (±2.61%). The alveolar bone loss model demonstrated exceptionally high performance, with a mean precision of 95.47% (±0.94%), recall of 98.60% (±0.49%), and an F1-score of 97.00% (±0.46%).
Conclusion: The YOLOv8-based models demonstrated high accuracy and high sensitivity for detecting dental caries and alveolar bone loss on periapical radiographs. The system shows significant potential as a reliable automated assistant for dental practitioners, helping to improve diagnostic consistency, reduce the risk of missed pathology, and ultimately enhance the standard of patient care.
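The image-level criterion described in this abstract (an image counts as a true positive when any predicted box overlaps any ground-truth box with IoU > 0) can be illustrated with a minimal sketch. This is not the authors' code: the corner-coordinate box format and the names `iou` and `image_is_true_positive` are assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Jaccard index (intersection over union) of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def image_is_true_positive(preds: Sequence[Box], truths: Sequence[Box]) -> bool:
    """True if any predicted box has IoU > 0 with any ground-truth box."""
    return any(iou(p, t) > 0 for p in preds for t in truths)
```

Under this lenient rule any non-zero overlap counts, which favours sensitivity over localisation accuracy; boxes that merely share an edge have zero intersection area and do not count.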

14
Algorithm-Based Model for Gastrointestinal and Liver Histopathological Analysis Using VGG16 and Specialized Stains: Statistical Validation of Thresholds in AI-Driven Digital Pathology

Adeluwoye, A. O.; Gbadegesin, M. O.; James, F. M.; Otegbade, P. S.; Alabetutu, A.

2026-04-11 pathology 10.64898/2026.04.08.26350456 medRxiv
Top 0.1%
4.9%

Digital pathology, coupled with advanced image recognition algorithms, represents a transformative frontier in histopathological diagnosis. This sub-Saharan African laboratorys exploratory study investigates the application of a Convolutional Neural Network (CNN) model, specifically leveraging the VGG16 architecture with transfer learning, for automated analysis and classification of selected gastrointestinal (GIT) and liver tissue samples, incorporating both routine and specialized staining protocols. The study utilized a dataset comprising 114 samples (18 liver, 96 GIT images) derived from archival formalin-fixed paraffin-embedded tissue blocks at University College Hospital, Ibadan, Nigeria. Specialized staining techniques included Alcian Yellow for GIT mucin visualization and Massons Trichrome for liver fibrosis assessment, alongside conventional H&E staining. Model performance was evaluated using statistical methodologies including Wilson Score confidence intervals (CI), Bayesian probability assessment, and effect size analysis. Results reveal a striking dichotomy in model performance. The GIT tissue model achieved perfect classification accuracy (100% test accuracy) with exceptional statistical significance (Z=10.0, p<0.0001), Wilson CI [96.29%, 99.99%], Cohens h=1.571, and Bayesian probability >99.99%. Conversely, the liver tissue model demonstrated diagnostic failure (42.86% test accuracy), with Z=-1.428, p=0.9236, Wilson CI [33.59%, 52.65%], Cohens h=-0.144, and Bayesian probability of 7.64%. This performance divergence correlates with training data availability, as the liver dataset fell far below empirically established thresholds (>100-200 samples) for reliable classification. The liver models failure reveals limitations in transfer learning with insufficient data. 
These findings carry important implications for AI-enhanced digital pathology: the GIT model is a promising candidate for deployment, and the results support tissue-specific model development.
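The two statistics named in this abstract, the Wilson score interval and Cohen's h, can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the sample counts in the usage lines are placeholders (the abstract does not give test-set sizes), and the 0.5 comparison baseline is an assumption, although it does reproduce the reported h = 1.571 for 100% accuracy.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score confidence interval for a binomial proportion.

    z=1.96 corresponds to a 95% interval (an assumed default).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * math.asin(math.sqrt(p1)) - 2 * math.asin(math.sqrt(p2))


# Placeholder counts, for illustration only.
low, high = wilson_interval(successes=96, n=96)
# Effect size of perfect accuracy against an assumed 0.5 baseline.
h = cohens_h(1.0, 0.5)
print(f"Wilson CI: [{low:.4f}, {high:.4f}], Cohen's h: {h:.3f}")
```

Unlike the simple normal approximation, the Wilson interval stays inside [0, 1] and has non-zero width when the observed proportion is exactly 0 or 1, which is why it suits small test sets with perfect or near-perfect accuracy.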

15
Data Matters: The Impact of Data Curation in the Classification of Histopathological Datasets

Brito-Pacheco, D. A.; Giannopoulos, P.; Reyes-Aldasoro, C. C.

2026-04-17 pathology 10.64898/2026.04.16.26351016 medRxiv
Top 0.1%
4.8%

In this work, the impact of outliers on the performance of machine learning and deep learning models is investigated, specifically for histopathological images of colorectal cancer stained with Haematoxylin and Eosin. The impact is evaluated through the systematic comparison of one machine learning model (Random Forests) and one deep learning model (ResNet-18). Both models were trained with the popular NCT-CRC-HE-VAL-100K dataset and tested on the CRC-HE-VAL-7K companion set. Then, a curation process was performed by analysing the divergence of patches based on chromatic, textural and topological features of the training set and removing outliers, after which training was repeated with the cleaned dataset. The results showed that machine learning models can benefit more from improvements in data quality than deep learning models. Further, the results suggest that deep learning models are more robust to outliers because, through the training process, the architectures can learn features other than those previously mentioned.
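The curation step described above (score each training patch on a set of features, then drop the outliers) might look roughly like the sketch below. The abstract does not specify the divergence measure, so a plain per-feature z-score rule is swapped in here purely for illustration; the function name, the threshold of 3.0, and the toy feature vectors are all assumptions.

```python
from statistics import mean, pstdev


def indices_without_outliers(features, z_max=3.0):
    """Return indices of patches whose features all lie within z_max
    standard deviations of the per-feature mean.

    features: list of equal-length numeric vectors, one per patch
    (for example chromatic or textural descriptors; hypothetical here).
    """
    columns = list(zip(*features))
    means = [mean(col) for col in columns]
    stdevs = [pstdev(col) for col in columns]
    kept = []
    for idx, row in enumerate(features):
        is_outlier = any(
            sd > 0 and abs(value - mu) / sd > z_max
            for value, mu, sd in zip(row, means, stdevs)
        )
        if not is_outlier:
            kept.append(idx)
    return kept


# Toy data: 20 similar patches and one patch with an extreme first feature.
toy = [[1.0, 1.0] for _ in range(20)] + [[100.0, 1.0]]
clean_indices = indices_without_outliers(toy)
print(len(toy), "patches before curation,", len(clean_indices), "after")
```

The retained indices can then be used to subset the training set before retraining, so the same split can be applied to both model types being compared.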

16
Development and Evaluation of iSupport-Malaysia: A Multimedia Web-Based Psychoeducational Intervention for Dementia Caregivers

Loh, K. J.; Lee, W. L.; Ng, A. L. O.; Chung, F. F. L.; Renganathan, E.

2026-04-21 geriatric medicine 10.64898/2026.04.14.26350743 medRxiv
Top 0.2%
4.3%

Background: Caring for people with dementia can impose a considerable psychological burden on caregivers, yet access to caregiver support in Malaysia remains limited. The World Health Organization's iSupport for Dementia program provides dementia education in a text-based e-learning format. However, a culturally adapted Malaysian version has not been available. Objective: This study aimed to develop and gather user feedback on a culturally adapted, multimedia version of iSupport tailored for Malaysia (iSupport-Malaysia). Methods: Guided by a four-phase cultural adaptation framework, the generic iSupport content was translated into Bahasa Malaysia, adapted to local customs, and transformed into multimedia lessons on an e-learning platform. A mixed-methods design was used to explore user perceptions and evaluate usability through four homogeneous focus group discussions and 15 individual usability test sessions with informal caregivers (FG: n=9; UT: n=9) and healthcare professionals (FG: n=11; UT: n=6). Focus groups examined aesthetics, ease of use, clarity, cultural relevance, comprehensiveness, and satisfaction. Usability testing involved Think Aloud tasks, post-test questionnaires, and brief interviews. Qualitative data were analysed thematically, and descriptive statistics summarised usability performance. Results: iSupport-Malaysia demonstrated good usability (M=74.3 ± 18.0), with most tasks completed without assistance. Strengths included interactive learning activities, peer discussion features, and flexible self-paced learning. Content was viewed as culturally appropriate, credible, and useful. Suggested improvements included enhancing visual aesthetics, shortening videos, refining quizzes, and increasing practical relevance. Conclusion: User insights indicate that iSupport-Malaysia is usable and culturally appropriate.
These findings will inform refinement of the platform prior to the pilot feasibility study and provide recommendations for future multimedia-based caregiver interventions.

17
Rapid protocol for mitochondria isolation from cardiomyocytes employing cell strainer-based procedure

Lewandowska, J.; Kalenik, B.; Szewczyk, A.; Wrzosek, A.

2026-04-06 biochemistry 10.64898/2026.04.02.716092 medRxiv
Top 0.2%
4.1%

Aims: The development of a method for isolating mitochondria from a specific cell type within a given tissue, while preserving their structural and functional integrity to the greatest possible extent, remains an ongoing challenge. The aim of this study was to establish a protocol for the isolation of mitochondria from rodent cardiomyocytes, characterized by minimal contamination with other cell types and a high yield of mitochondrial fractions originating from distinct subcellular regions of cardiomyocytes. Methods and results: In the present study, cardiomyocytes from guinea pig and rat hearts were isolated using a standard enzymatic digestion protocol in a Langendorff heart perfusion system. Traditionally, the isolation of organelles, including mitochondria, from whole cardiac tissue as well as from cardiomyocytes has relied primarily on mechanical tissue homogenization. These conventional approaches involve the localized application of high pressure to cells, which may damage delicate organelles, particularly mitochondria. Moreover, such homogenization preferentially releases mitochondria located in the subsarcolemmal region of cardiomyocytes rather than representing the entire mitochondrial population. In our study, we employed an alternative approach based on the gentle mechanical disruption of cardiomyocytes by passing the cell suspension through selected cell strainers using a cell scraper. This strategy facilitated mild disruption of cellular structures, significantly increasing the yield of mitochondria released from interfibrillar regions while preserving mitochondrial functionality. Moreover, because it exploits differences in cell size, this method decreases the probability of sample contamination with mitochondria from other cells.
The effectiveness of this method was confirmed by transmission electron microscopy and high-resolution respirometry, which revealed no evidence of outer mitochondrial membrane damage, as indicated by the lack of response to the addition of exogenous cytochrome c to the incubation chamber. Moreover, mitochondrial oxygen consumption increased 7.39 ± 1.25-fold following the addition of 100 µM ADP, reflecting efficient ADP-stimulated respiration. Furthermore, fluorescence measurements were performed to assess changes in the mitochondrial inner membrane potential (ΔΨ). The isolated mitochondria were also suitable for electrophysiological studies using the single-channel patch-clamp technique. Additionally, mitochondria isolated using the protocol developed in our laboratory exhibited a high capacity for transplantation into H9c2 cells. Conclusion: In summary, our mitochondrial isolation method is rapid, efficient, and yields functionally competent mitochondria. These preparations are suitable for a wide range of downstream applications, including patch-clamp electrophysiology, analyses of oxygen consumption under various pharmacological conditions, and mitochondrial transplantation. Graphical abstract: Figure 1.

18
Evaluation of a multiplexed tiling PCR scheme for whole-genome amplification of hepatitis B virus using Oxford Nanopore sequencing

Brate, J.; Grande, E. G.; Pedersen, B. N.; Frengen, T. G.; Stene-Johansen, K.

2026-03-31 molecular biology 10.64898/2026.03.28.714721 medRxiv
Top 0.2%
3.9%

Here we evaluated the performance of a previously published tiling PCR primer scheme by Ringlander et al. (2022) for whole-genome amplification of Hepatitis B virus (HBV) in combination with Oxford Nanopore sequencing. The primer set originally developed for Ion Torrent sequencing was adapted by removing platform-specific adapters and tested using clinical serum or plasma samples submitted for routine HBV genotyping and resistance testing. Two multiplexing strategies were compared: a single PCR pool containing all primers and a two-pool strategy with non-overlapping amplicons. Sequencing reads were processed using a Nanopore analysis pipeline, and genome coverage and amplicon performance were compared across samples spanning a wide Ct range and representing HBV genotypes A-E. Across all samples, the median genome coverage was approximately 50%, although recovery varied widely, ranging from complete failure to nearly full genomes. Combining all primers into a single PCR reaction, or separating overlapping amplicons into different reactions, had little overall impact on genome recovery, and no consistent differences between the two pooling strategies were observed. In contrast, amplification efficiency differed markedly between individual amplicons. Amplicons 1-5 generally produced higher sequencing depth, whereas amplicons 6-10 frequently showed low coverage and contributed to incomplete genome recovery. Genome coverage was strongly associated with Ct values, with higher coverage observed in samples with lower Ct values, while coverage was broadly similar across genotypes. These results demonstrate that the Ringlander et al. primer scheme can be adapted for multiplex PCR and Nanopore sequencing of HBV, but uneven amplicon performance limits consistent full-genome recovery and highlights the need for further optimization of HBV tiling PCR designs.
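The two quantities compared in this abstract, genome coverage and per-amplicon sequencing depth, can be computed from a per-position depth vector along these lines. This is a minimal sketch, not the pipeline used in the study: the function names, the minimum-depth threshold of 10 reads, and the amplicon coordinates in the usage lines are hypothetical.

```python
def genome_coverage(depths, min_depth=10):
    """Fraction of reference positions covered by at least min_depth reads.

    depths: per-position read depth along the reference genome.
    min_depth: assumed threshold for calling a position covered.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return covered / len(depths)


def mean_depth_per_amplicon(depths, amplicons):
    """Mean depth within each amplicon.

    amplicons: mapping of name -> (start, end), 0-based and half-open;
    the coordinates used below are made up for illustration.
    """
    result = {}
    for name, (start, end) in amplicons.items():
        window = depths[start:end]
        result[name] = sum(window) / len(window) if window else 0.0
    return result


# Toy genome of 100 positions: first half uncovered, second half at depth 20.
toy_depths = [0] * 50 + [20] * 50
print(genome_coverage(toy_depths))
print(mean_depth_per_amplicon(toy_depths, {"amp1": (0, 50), "amp2": (50, 100)}))
```

Reporting both numbers together helps separate the two failure modes the abstract describes: a sample with low overall coverage versus a sample where specific amplicons drop out.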

19
OGSCalc: Mathematical formulae and web-based application to incorporate rotational discrepancies into translational discrepancies for assessment of accuracy in orthognathic surgery

Hue, J.; Yeo, J.; Saigo, L.

2026-04-04 dentistry and oral medicine 10.64898/2026.04.03.26350094 medRxiv
Top 0.2%
3.8%

Objectives: Accurate assessment of orthognathic surgical accuracy is essential in the evaluation of operative techniques. Surgical accuracy is often reported as rotational and translational deviations from planned positions. This results in six separate values: translations in three planes, anterior-posterior (AP), superior-inferior (SI) and medial-lateral (ML), and rotations about three axes, pitch, roll and yaw. However, rotations will influence 3-dimensional positions and translational discrepancies. Methods: We have derived a mathematical formula using Euclidean geometry and quadratic functions that quantifies the impact of rotations on translational discrepancies. This allows for the calculation of a total discrepancy value that incorporates the three translations and rotations. Furthermore, we developed an interactive web-based application using the open-source Shiny R package. Results: We have successfully reduced equations from Euclidean geometry into a quadratic form. The equation is as follows: [4(sin θ/2)^2 - 2]x^2 + [8d(sin θ/2)^2 - 2d]x + 4d^2(sin θ/2)^2 = 0, where θ represents the rotational discrepancy in radians and d represents the translational discrepancy. This allows us to solve for the correction that must be made to translational discrepancies to account for the influence of rotational discrepancies. We successfully developed a web application with a user-friendly graphical user interface. Clinicians upload their own data in the Excel (.xlsx) file format and the application automatically performs the necessary calculations over many patients, returning a downloadable table of results. Conclusion: We present a mathematical formula incorporated into a web application to combine translational and rotational discrepancies for deeper insight when evaluating orthognathic surgical accuracy. Clinical Relevance: This allows surgeons to account for rotational influence on 3-dimensional translational discrepancies.
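The quadratic quoted in this abstract can be solved numerically as sketched below. This is an illustration, not the OGSCalc implementation: the function name is hypothetical, and the sine term is read as sin^2(θ/2), which is an assumption because the superscripts and grouping are lost in the posted text. The abstract does not say which root serves as the correction, so both are returned.

```python
import math


def rotation_correction_roots(theta, d):
    """Solve a*x^2 + b*x + c = 0 with the coefficients from the abstract.

    theta: rotational discrepancy in radians.
    d: translational discrepancy.
    Assumes the sine term means sin(theta / 2) squared.
    Returns both real roots in ascending order.
    """
    s = math.sin(theta / 2) ** 2
    a = 4 * s - 2
    b = 8 * d * s - 2 * d
    c = 4 * d ** 2 * s
    if a == 0:
        raise ValueError("equation is not quadratic for this theta")
    # Under this reading the discriminant reduces algebraically to 4*d^2,
    # so it is never negative; max() only guards against rounding error.
    disc = max(b * b - 4 * a * c, 0.0)
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])


# Example: 0.1 rad rotational discrepancy, 2.0 units of translation.
print(rotation_correction_roots(0.1, 2.0))
```

Under the stated reading, one root is always -d and the other equals d(1/cos θ - 1), which goes to zero as the rotational discrepancy vanishes.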

20
Characteristic resting state facial expressions in older adults with mild cognitive impairment level

Miyayama, M.; Sekiguchi, T.; Sugimoto, H.; Kawagoe, T.; Tripanpitak, K.; Wolf, A.; Kumagai, K.; Fukumori, K.; Miura, K. W.; Okada, S.; Ishimaru, K.; Otake-Matsuura, M.

2026-04-11 geriatric medicine 10.64898/2026.04.10.26350581 medRxiv
Top 0.2%
3.7%

Background: For early detection of Alzheimer's disease, it is essential to identify individuals showing cognitive performance consistent with the mild cognitive impairment (MCI) range during preliminary screening, ideally using methods that extend beyond conventional cognitive assessments. Non-invasive, easily accessible screening tools applicable in daily life are increasingly needed. Facial expressions, particularly during rest, may offer promising biomarkers for MCI level detection. This study aimed to identify specific facial features associated with MCI level during rest to inform development of facial expression-based screening tools. Methods: Participants were classified into an MCI level group and a healthy control (HC) group based on Montreal Cognitive Assessment (MoCA) scores. Facial Action Units (AUs) were extracted from video recordings of resting-state facial expressions in 31 individuals with MCI level and 14 HC. Two statistical models were employed: a multilevel zero-inflated beta regression model for the intensity of 17 AUs and a multilevel logistic regression model for the presence or absence of 18 AUs. Results: In the zero-inflated beta regression, the AU related to upper lip raiser showed a significant group effect (MCI level vs. HC; p < 0.001), which remained significant after multiple comparison correction. The logistic regression revealed significant group differences for the AUs related to lip tightener (p < 0.001) and lip suck (p < 0.001), both of which remained significant after multiple comparison correction. Conclusions: Distinctive facial action patterns during rest were observed in individuals with MCI level. These findings highlight the potential of resting-state facial expressions as a basis for novel, unobtrusive screening tools for early MCI level detection.